
    Tight on Budget? Tight Bounds for r-Fold Approximate Differential Privacy

    Many applications, such as anonymous communication systems, privacy-enhancing database queries, or privacy-enhancing machine-learning methods, require robust guarantees under thousands and sometimes millions of observations. The notion of r-fold approximate differential privacy (ADP) offers a well-established framework with a precise characterization of the degree of privacy after r observations by an attacker. However, existing bounds for r-fold ADP are loose and, if used for estimating the required degree of noise for an application, can lead to over-cautious choices of perturbation randomness and thus to suboptimal utility or overly high costs. We present a numerical and widely applicable method for capturing the privacy loss of differentially private mechanisms under composition, which we call privacy buckets. With privacy buckets we compute provable upper and lower bounds on ADP for a given number of observations. We compare our bounds with state-of-the-art bounds for r-fold ADP, including Kairouz, Oh, and Viswanath's composition theorem (KOV), concentrated differential privacy, and the moments accountant. While KOV proved optimal bounds for heterogeneous adaptive k-fold composition, we show that for concrete sequences of mechanisms tighter bounds can be derived by taking the mechanisms' structure into account. We compare against previous bounds for the Laplace mechanism, the Gauss mechanism, a timing leakage reduction mechanism, and stochastic gradient descent, and we significantly improve over their results (except that we match the KOV bound for the Laplace mechanism, for which it appears tight). Our lower bounds almost meet our upper bounds, showing that no significantly tighter bounds are possible.
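
    As a rough illustration of the bucketing idea, consider the following sketch (our own simplification in Python, not the authors' implementation; the growth factor F, the bucket count N, and the dict-based distributions are illustrative assumptions). The privacy-loss ratio P(o)/Q(o) is rounded up into geometrically spaced buckets, composition becomes a discrete convolution of bucket masses, and an ADP bound delta(eps) is read off at the end.

```python
import math

F = 1.4    # bucket growth factor (assumed; tuning it trades precision for speed)
N = 100    # buckets cover privacy-loss ratios F**-N .. F**N

def bucketize(P, Q):
    """Distributions as dicts mapping outcome -> probability.
    Bucket i collects the P-mass of outcomes with P(o)/Q(o) <= F**(i-N),
    rounding the ratio *up* so the final delta is a provable upper bound."""
    buckets = [0.0] * (2 * N + 1)
    inf_mass = 0.0  # outcomes with Q(o) = 0 or a ratio beyond F**N
    for o, p in P.items():
        q = Q.get(o, 0.0)
        if q == 0.0 or p / q > F ** N:
            inf_mass += p
        else:
            buckets[max(0, math.ceil(math.log(p / q, F)) + N)] += p
    return buckets, inf_mass

def compose(a, inf_a, b, inf_b):
    """Privacy-loss ratios multiply under composition, so bucket indices
    add: r-fold composition is r-1 discrete convolutions."""
    out = [0.0] * (2 * N + 1)
    inf_mass = inf_a + inf_b - inf_a * inf_b
    for i, pa in enumerate(a):
        for j, pb in enumerate(b):
            k = i + j - N
            if k > 2 * N:
                inf_mass += pa * pb   # overflow rounds up, stays a valid bound
            else:
                out[max(0, k)] += pa * pb
    return out, inf_mass

def delta_upper(buckets, inf_mass, eps):
    """ADP: delta(eps) <= mass of buckets whose ratio exceeds e**eps."""
    return inf_mass + sum(mass * (1.0 - math.exp(eps) / F ** (i - N))
                          for i, mass in enumerate(buckets)
                          if F ** (i - N) > math.exp(eps))
```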

    S-GBDT: Frugal Differentially Private Gradient Boosting Decision Trees

    Privacy-preserving learning of gradient boosting decision trees (GBDT) has the potential for strong utility-privacy tradeoffs on tabular data, such as census data or medical metadata: classical GBDT learners can extract non-linear patterns from small datasets. The state-of-the-art notion for provable privacy properties is differential privacy, which requires that the impact of single data points is limited and deniable. We introduce a novel differentially private GBDT learner and utilize four main techniques to improve the utility-privacy tradeoff. (1) We use an improved noise scaling approach with tighter accounting of the privacy leakage of a decision tree leaf compared to prior work, resulting in noise that in expectation scales with O(1/n) for n data points. (2) We integrate individual Rényi filters into our method to learn from data points that were underutilized during the iterative training process, which -- potentially of independent interest -- yields a natural yet effective approach to learning on streams of non-i.i.d. data. (3) We incorporate random decision tree splits to concentrate the privacy budget on learning the leaves. (4) We deploy subsampling for privacy amplification. Our evaluation shows, for the Abalone dataset (<4k training data points), an R²-score of 0.39 for ε=0.15, which the closest prior work only achieved for ε=10.0. On the Adult dataset (50k training data points) we achieve a test error of 18.7% for ε=0.07, which the closest prior work only achieved for ε=1.0. For the Abalone dataset at ε=0.54 we achieve an R²-score of 0.47, very close to the R²-score of 0.54 of the non-private version of GBDT. For the Adult dataset at ε=0.54 we achieve a test error of 17.1%, close to the test error of 13.7% of the non-private version of GBDT.
    Comment: The first two authors contributed equally to this work.
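
    Technique (1) can be made concrete with a small sketch (a hypothetical simplification, not the S-GBDT algorithm itself; the clipping bound and the Laplace mechanism are our assumptions). Clipping each per-example gradient bounds the sensitivity of the leaf's gradient sum, so fixed-scale noise added to that sum shrinks like O(1/n) in the resulting leaf value:

```python
import numpy as np

rng = np.random.default_rng(0)

def dp_leaf_value(gradients, eps, clip=1.0, reg_lambda=1.0):
    """Differentially private value for one decision-tree leaf.
    Clipping gradients to [-clip, clip] bounds the sensitivity of their
    sum by clip, so Laplace noise of scale clip/eps suffices; dividing
    the noisy sum by the leaf size makes the noise O(1/n)."""
    g = np.clip(np.asarray(gradients, dtype=float), -clip, clip)
    noisy_sum = g.sum() + rng.laplace(scale=clip / eps)
    return noisy_sum / (len(g) + reg_lambda)

# The larger the leaf, the smaller the relative perturbation:
print(dp_leaf_value(rng.normal(size=100), eps=0.5))
print(dp_leaf_value(rng.normal(size=10_000), eps=0.5))
```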

    Divide and Funnel: a Scaling Technique for Mix-Networks

    While many anonymous communication (AC) protocols have been proposed to provide anonymity over the internet, scaling to a large number of users while remaining provably secure is challenging. We tackle this challenge by proposing a new scaling technique that improves the scalability and anonymity of AC protocols by distributing the computational load over many nodes without completely disconnecting the paths that different messages take through the network. We demonstrate that our scaling technique is useful and practical through a core sample AC protocol, Streams, that offers provable security guarantees and scales to a million messages. The scaling technique ensures that each node in the system performs the computation-heavy public-key operation for only a tiny fraction of the total messages routed through the Streams network while maximizing the mixing/shuffling in every round. We demonstrate Streams' performance through a prototype implementation. Our results show that Streams scales well even when the system carries a load of one million messages at any point in time. Streams maintains a latency of 16 seconds while offering provable "one-in-a-billion" unlinkability, and can be leveraged for applications such as anonymous microblogging and network-level anonymity for blockchains. We also illustrate by example that our scaling technique can help many other AC protocols improve their scalability and privacy, and may be of interest to protocol developers.
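
    The division of labor can be pictured with a toy round (purely illustrative; the real Streams protocol's message format, onion layers, and scheduling are in the paper): each node decrypts only its share of the batch, but the whole batch is shuffled together, so the anonymity set stays the full batch rather than one node's share.

```python
import random

def route_round(messages, nodes, seed=42):
    """Toy divide-and-funnel round: divide the expensive public-key
    decryption across nodes, then funnel everything into a joint shuffle."""
    rng = random.Random(seed)
    shares = {n: [] for n in nodes}
    for m in messages:                       # divide: ~1/len(nodes) each
        shares[rng.choice(nodes)].append(m)
    processed = [f"dec({m})" for n in nodes for m in shares[n]]
    rng.shuffle(processed)                   # funnel: mix the full batch
    return processed

print(route_round([f"msg{i}" for i in range(8)], ["A", "B", "C"]))
```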

    (Nothing else) MATor(s): Monitoring the Anonymity of Tor's Path Selection

    In this paper we present MATor: a framework for rigorously assessing the degree of anonymity in the Tor network. The framework explicitly addresses how user anonymity is impacted by real-life characteristics of the actually deployed Tor network, such as its path selection algorithm, Tor consensus data, and the preferences and connections of the user. The anonymity assessment is based on rigorous anonymity bounds that are derived in an extension of the AnoA framework (IEEE CSF 2013). We show how to apply MATor to Tor's publicly available consensus and server descriptor data, thereby realizing the first real-time anonymity monitor. Based on experimental evaluations of this anonymity monitor on Tor Metrics data, we propose an alternative path selection algorithm that provides stronger anonymity guarantees without decreasing the overall performance of the Tor network.
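
    As a schematic of what such a bound can look like (our own toy, not MATor's actual computation, which works on full consensus and descriptor data): if user preferences induce different path-selection distributions, a passive adversary's distinguishing advantage is bounded by their statistical distance.

```python
def advantage_bound(path_probs_u1, path_probs_u2):
    """Upper-bound an adversary's advantage in telling two users apart by
    the total variation distance between their circuit
    (guard, middle, exit) selection distributions."""
    paths = set(path_probs_u1) | set(path_probs_u2)
    return 0.5 * sum(abs(path_probs_u1.get(p, 0.0) - path_probs_u2.get(p, 0.0))
                     for p in paths)

# Two users whose preferences shift weight between the same two circuits:
u1 = {("guard1", "mid1", "exit1"): 0.5, ("guard2", "mid1", "exit2"): 0.5}
u2 = {("guard1", "mid1", "exit1"): 0.6, ("guard2", "mid1", "exit2"): 0.4}
print(advantage_bound(u1, u2))  # 0.1
```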

    Anonymity Trilemma: Strong Anonymity, Low Bandwidth Overhead, Low Latency---Choose Two

    This work investigates the fundamental constraints of anonymous communication (AC) protocols. We analyze the relationship between bandwidth overhead, latency overhead, and sender anonymity or recipient anonymity against a global passive (network-level) adversary. We confirm the trilemma that an AC protocol can achieve only two of the following three properties: strong anonymity (i.e., anonymity up to a negligible chance), low bandwidth overhead, and low latency overhead. We further study anonymity against a stronger global passive adversary that can additionally passively compromise some of the AC protocol's nodes. For a given number of compromised nodes, we derive as a necessary constraint a relationship between bandwidth and latency overhead whose violation makes it impossible for an AC protocol to achieve strong anonymity. We analyze prominent AC protocols from the literature and show to what extent they satisfy our necessary constraints. Our fundamental necessary constraints offer a guideline not only for improving existing AC systems but also for designing novel AC protocols with non-traditional bandwidth and latency overhead choices.
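
    To convey the flavor of such a constraint (a toy stand-in with made-up constants; the paper derives the precise inequalities), a condition of the form 2*l*b >= 1, for latency overhead l (rounds a message may be delayed) and bandwidth overhead b (noise messages per user per round), says that a protocol must pay in at least one of the two overheads:

```python
def may_achieve_strong_anonymity(latency_overhead, bandwidth_overhead):
    """Toy necessary condition: if too few other (real or noise) messages
    can plausibly occupy the observed delivery window, the sender cannot
    hide among the user base. Constants here are illustrative only."""
    return 2 * latency_overhead * bandwidth_overhead >= 1

print(may_achieve_strong_anonymity(0.0, 0.0))   # neither overhead: False
print(may_achieve_strong_anonymity(10, 0.5))    # pay in latency: True
print(may_achieve_strong_anonymity(0.25, 2))    # pay in bandwidth: True
```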

    A Toolchain for Privacy-Preserving Distributed Aggregation on Edge-Devices

    Valuable insights, such as frequently visited environments in the wake of the COVID-19 pandemic, can often only be gained by analyzing sensitive data spread across edge devices like smartphones. To facilitate such an analysis, we present a toolchain for a distributed, privacy-preserving aggregation of local data that takes the limited resources of edge devices into account. The distributed aggregation is based on secure summation and simultaneously satisfies the notion of differential privacy. In this way, other parties can learn neither the sensitive data of single clients nor a single client's influence on the final result. We evaluate the power consumption, running time, and bandwidth overhead on real as well as simulated devices, and demonstrate the flexibility of our toolchain by extending the summation of histograms to distributed clustering.
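
    A minimal sketch of the two named ingredients, secure summation plus differential privacy (illustrative only; the toolchain's actual wire protocol, dropout handling, and noise distribution are in the paper): pairwise masks cancel in the global sum, and Laplace noise makes the revealed sum differentially private. For simplicity every client here adds full-scale noise; distributed noise-share schemes get away with less in total.

```python
import random

rng = random.Random(7)

def laplace(scale):
    # The difference of two i.i.d. exponentials is Laplace-distributed.
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)

def aggregate(values, eps, sensitivity=1.0):
    """Each pair of clients (i, j) shares a random mask that i adds and
    j subtracts, so masks cancel in the sum and no single masked report
    reveals its value; added Laplace noise protects the sum itself."""
    n = len(values)
    reports = list(values)
    for i in range(n):
        for j in range(i + 1, n):
            m = rng.uniform(-1e6, 1e6)
            reports[i] += m
            reports[j] -= m
    reports = [r + laplace(sensitivity / eps) for r in reports]
    return sum(reports)   # only the noisy sum is learned

print(aggregate([3, 5, 2, 7], eps=1.0))  # close to 17, up to DP noise
```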

    MixFlow: Assessing Mixnets Anonymity with Contrastive Architectures and Semantic Network Information

    Traffic correlation attacks have illustrated the challenges of protecting communication metadata, yet short flows, as in messaging applications like Signal, have so far been protected from traffic correlation attacks by practical Mixnets such as Loopix. This paper introduces a novel traffic correlation attack against short-flow applications like Signal that are tunneled through practical Mixnets like Loopix. We propose the MixFlow model, an approach for analyzing the unlinkability of communications through Mix networks, and perform our analysis on Loopix as a prominent example. MixFlow is a contrastive model that looks for semantic relationships between entry and exit flows, even if the traffic is tunneled through Mixnets that, like Loopix, protect metadata via Poisson mixing delay and cover traffic. We use the MixFlow model to evaluate the resistance of Loopix Mix networks against an adversary that observes only the inflow and outflow of the Mixnet and tries to correlate communication flows. Our experiments indicate that the MixFlow model is exceptionally proficient at connecting end-to-end flows, even when the Poisson delay and cover traffic are increased. These findings challenge the conventional notion that adding Poisson mixing delay and cover traffic obscures the metadata patterns and relationships between communicating parties. Despite the implementation of Poisson mixing countermeasures in Mixnets, MixFlow is still capable of effectively linking end-to-end flows, enabling the extraction of meta-information and correlation between inflows and outflows. Our findings have important implications for existing Poisson-mixing techniques and open up new opportunities for analyzing the anonymity and unlinkability of communication protocols.
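
    The scoring step can be pictured with a hand-made stand-in for the learned encoder (purely illustrative; MixFlow's actual contrastive architecture is trained, and these feature and delay choices are our assumptions): embed entry and exit flows as normalized traffic-volume histograms and score candidate pairs by cosine similarity.

```python
import numpy as np

def flow_embedding(timestamps, n_bins=16, window=60.0):
    """Stand-in embedding: a normalized histogram of packet times."""
    hist, _ = np.histogram(timestamps, bins=n_bins, range=(0.0, window))
    v = hist.astype(float)
    return v / (np.linalg.norm(v) + 1e-9)

def link_score(entry_flow, exit_flow):
    """Cosine similarity of the two embeddings; high scores predict that
    entry and exit belong to the same end-to-end communication."""
    return float(flow_embedding(entry_flow) @ flow_embedding(exit_flow))

rng = np.random.default_rng(1)

def bursty_flow(centers, per_burst=40, spread=0.5):
    return np.sort(np.concatenate(
        [c + rng.normal(0, spread, per_burst) for c in centers]))

entry = bursty_flow([5, 18, 33, 44])
exit_same = np.sort(entry + rng.exponential(1.5, entry.size))  # mixing delay
exit_other = bursty_flow([9, 25, 40, 52])
print(link_score(entry, exit_same), ">", link_score(entry, exit_other))
```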

    AnoA: A Framework For Analyzing Anonymous Communication Protocols

    Anonymous communication (AC) protocols such as the widely used Tor network have been designed to provide anonymity over the Internet to their participating users. While AC protocols have been the subject of several security and anonymity analyses in recent years, there still does not exist a framework for analyzing complex systems, such as Tor, and their different anonymity properties in a unified manner. In this work we present AnoA: a generic framework for defining, analyzing, and quantifying anonymity properties for AC protocols. In addition to quantifying the (additive) advantage of an adversary in an indistinguishability-based definition, AnoA uses a multiplicative factor, inspired by differential privacy. AnoA enables a unified quantitative analysis of well-established anonymity properties, such as sender anonymity, sender unlinkability, and relationship anonymity. AnoA modularly specifies adversarial capabilities by a simple wrapper construction, called adversary classes. We examine the structure of these adversary classes and identify conditions under which it suffices to establish anonymity guarantees for single messages in order to derive guarantees for arbitrarily many messages. We coin this condition single-challenge reducibility. This then leads us to the definition of Plug'n'Play adversary classes (PAC), which are easy to use, expressive, and single-challenge reducible. Additionally, we show that our framework is compatible with the universal composability (UC) framework. Leveraging a recent security proof about Tor, we illustrate how to apply AnoA to a simplified version of Tor against passive adversaries.
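
    The multiplicative-plus-additive style of guarantee can be written down schematically (our own simplification over finite observation distributions; AnoA itself is defined via interactive machines and adversary classes): the adversary's view under challenge bit 0 must stay within a multiplicative e^eps factor plus an additive delta of the view under bit 1, and vice versa.

```python
import math

def _slack(p, q, eps):
    # Smallest delta with Pr[S | p] <= e**eps * Pr[S | q] + delta for all S.
    return sum(max(0.0, p.get(o, 0.0) - math.exp(eps) * q.get(o, 0.0))
               for o in set(p) | set(q))

def indistinguishable(p0, p1, eps, delta):
    """Check the (eps, delta) closeness of two observation distributions
    in both directions, the schematic core of an AnoA-style guarantee."""
    return max(_slack(p0, p1, eps), _slack(p1, p0, eps)) <= delta

# Two slightly different views of a passive adversary:
p0 = {"circuitA": 0.6, "circuitB": 0.4}
p1 = {"circuitA": 0.5, "circuitB": 0.5}
print(indistinguishable(p0, p1, eps=0.25, delta=0.0))  # True: e**0.25 absorbs the gap
```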